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Abstract. Image reconstruction methods based on exploiting image sparsity, motivated by 
compressed sensing (CS), allow reconstruction from a significantly reduced number of projections in 
X-ray computed tomography (CT). However, CS provides neither theoretical guarantees of accu- 
rate CT reconstruction, nor any relation between sparsity and a sufficient number of measurements 
for recovery. In this paper, we demonstrate empirically through computer simulations that mini- 
mization of the image 1-norm allows for recovery of sparse images from fewer measurements than 
unknown pixels, without relying on artificial random sampling patterns. We establish quantitatively 
an average-case relation between image sparsity and sufficient number of measurements for recovery, 
and we show that the transition from non-recovery to recovery is sharp within well-defined classes of 
simple and semi-realistic test images. The specific behavior depends on the type of image, but the 
same quantitative relation holds independently of image size. 
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1. Introduction. In X-ray computed tomography (CT) an object is to be re- 
constructed from projections obtained by measuring the attenuation of X-rays sent 
through the object. The forward problem, which describes how the projection data 
arises from the object, can be modelled either with an analytical formulation or an 
algebraic formulation. In the analytical formulation, the image is reconstructed by 
explicitly evaluating closed-form analytical inverses of the forward operator. Exam- 
ples of analytical reconstruction methods are filtered back-projection |21j and the 
Fcldkamp-Davis-Kress method for cone-beam CT [Tl]. Their main advantages are 
low memory demands and computational efficiency, which make them the current 
methods of choice in commercial CT scanners [3j [22] . 

In the algebraic formulation the forward operator is fully discretized, which leads 
to a large system of linear equations that must be solved in a stable fashion, typically 
by means of an iterative algorithm. The main advantage of the algebraic formulation 
is flexibility: It can handle geometries for which no analytical inverse is available, such 
as non-standard scanning geometries, and it allows for incorporation of various forms 
of prior knowledge about the object. Within the algebraic formulation there are many 
different kinds of reconstruction methods; some, such as the algebraic reconstruction 
technique (ART) and variants compute an approximate solution to the discretized 
linear system. These methods allow for simple constraints such as non-negativity. In 
optimization-based or variational methods the image is reconstructed by extremizing 
an objective function chosen to measure image desirability. These methods allow 
for incorporation of many types of prior knowledge into the objective function or 
constraints, such as image smoothness and knowledge of the statistical noise model. 

Motivated by the desire to reduce the exposure to radiation, there is a growing 
interest in low-dose CT, cf. [31] and references therein. This is relevant in medi- 
cal imaging to reduce the risk of radiation-induced cancer, and in biomedical imaging 
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where high doses can damage the specimen under study. It is recognized that the data 
reduction arising from low-dose imaging can sometimes be compensated for by incor- 
porating prior information about the image, which therefore calls for the algebraic 
formulation, typically in the form of optimization-based reconstruction. 

Developments in compressed sensing (CS) [6j [10] show potential for a reduction 
in data while maintaining or even improving reconstruction quality. This is made pos- 
sible by exploiting image sparsity; loosely speaking, if the image is "sparse enough" , 
i.e., having "sufficiently few" nonzero elements in some representation, it admits ac- 
curate reconstruction from "undersamplcd" data. We will refer to such methods as 
sparsity- exploiting methods. 

A typical cross-section image of the human body consists of well-separated areas 
of relatively homogeneous tissue, and same is true for poly-crystalline materials (such 
as metals) and for many other applications. Such images can be seen as having an 
approximately sparse gradient, and this property can be exploited in a reconstruc- 
tion algorithm by minimizing the total variation (TV) semi- norm |28| . Studies using 
simulated as well as clinical data have demonstrated that sparsity-exploiting methods 
indeed allow for reconstruction from fewer projections [U HI HU [27] Ell [30] . In spite of 
these positive results, we still lack a fundamental understanding of conditions under 
which such methods can be expected to perform well in CT. 

The present paper investigates a possible relation between the image sparsity and 
the number of CT views sufficient for recovery of the underlying true image. To 
simplify our discussion and analysis we focus on reconstruction of sparse images, i.e., 
images with a large number of zero pixels. We can apply £i-norm minimization of 
the image to enforce this kind of sparsity. These studies are interesting in their own 
right (see [19] for an example of ^i-norm based CT reconstruction of blood vessels), 
and they set the stage for forthcoming studies of TV and other types of sparsity. 

We are unaware of theoretical results that cover the mathematical model for CT. 
Inspired by work of Donoho and Tanner [9] we therefore use computer simulations 
to study recoverability by £i-norm minimization within well-defined classes of sparse 
images, that is, the average number of projections sufficient for recovering an image 
as function of the image sparsity. To this end, we introduce for a single image the 
concepts of strong recovery meaning that the image is reconstructed accurately (to 
within a chosen numerical accuracy) , and weak recovery meaning that only an image 
with same ^i-norm as the original is found. We also introduce the relative sparsity- 
sampling diagram for studying these quantities as function of image sparsity and 
number of projections, and we argue that these tools are of general value for studies 
of sparsity-exploiting methods. An advantage of our empirical approach (instead 
of relying on specific theoretical results) is that we can use the same technique to 
characterize systems of increasing levels of realism. Our findings for X-ray CT can be 
summarized as follows: 

1. Image recovery is possible with a structured sampling pattern as used in 
commercial CT scanners. 

2. There is a quantitative relation between image sparsity and sufficient sam- 
pling for recovery. 

3. There is a sharp transition from non-recovery to recovery. 

4. The relation varies with respect to different image classes. 

5. The relation holds independently of the image dimension. 

Another interesting result is that the gain in using a sparsity-exploiting method is 
significant, even for images with relatively many nonzeros. 
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Our paper is organized as follows. Section [5] gives the problem formulation and 
the sparsity-exploiting reconstruction method, and it introduces the concept of recov- 
crability in CT. Section [3] describes our numerical simulations, including details of 
the CT imaging model, generation of sparse phantoms, and how to robustly solve the 
reconstruction optimization problems. Section [4] presents an overview of our results, 
and we conclude with a discussion of the results in Section [5] 

2. Sparsity-exploiting reconstruction methods for CT. The purpose of 
this brief section is to define the notation, the algebraic formulation, and the recon- 
struction method used throughout the study. 

2.1. Constrained algebraic reconstruction. We consider the discrete inverse 
problem of recovering a signal x or ig £ R N from (usually noisy) measurements b £ IR M . 
The imaging model, which is assumed to be linear and discretc-to-discrete [2], relates 
the image and the data through a system matrix A £ R MxAr , 

b = Ax + e, (2.1) 

where e £ R M represents additive noise, and the elements of x £ R N are pixel 
values stacked into a vector. To solve (|2.1j) it is often necessary to impose some sort 
of regularization in order to reduce noise amplification in the inversion as well as 
to restrict the set of solutions in case of an underdetermined and/or rank deficient 
problem. A very common type of regularization takes the form: min^ J{x) s.t. 
H-Aa: — b\\\ < e 2 , where J(x) is a regularize^ i.e., a function selected to impose some 
condition on the image that reflects prior knowledge or assumptions. In this work we 
use J(x) = ||a;||i, which is known in many cases to produce a sparse x, as discussed 
below. 

The regularization parameter e reflects the amount of noise in the data, and in 
the limit e — > we obtain the equality-constrained problem: 

LI: min ||a;||i s.t. Ax = b. (2.2) 

X 

The corresponding problem L2 with J(x) = \\x\\\, which corresponds to Tikhonov 
regularization, gives the unique minimum- norm solution, i.e., the vector of minimal 
^2-norm among all vectors satisfying A x = b. The problem LI can have non-unique 
solutions, an issue which we will address in Section [3T^1 

The inequality-constrained problem is of more practical interest than (|2 -2[) be- 
cause it allows for noisy and inconsistent measurements, but its solution depends in 
a complex way on the noise and inconsistencies in the data, as well as the choice 
of the parameter e. Studies of the equality-constrained problem (|2.2p . on the other 
hand, provide an basic understanding of a given regularizer's reconstruction poten- 
tial, independent of specific noise. Therefore, we focus in the present study on the 
ideal equality-constrained problem. This means that we do not consider noisy data, 
uncertainties in the system matrix, or the influence of a regularization parameter. 

2.2. Recoverability of problem instances. The interest in LI (as well as 
TV and other sparsity-exploiting methods) is motivated by recent developments in 
CS demonstrating that it is possible to recover x or i g from a reduced number of mea- 
surements [6l [10] . The underlying assumption is that the image is has few non-zero 
pixels, or that it is sparse in some other representation (such as after applying a dis- 
crete gradient transformation to the image). We call a vector x £ Wi N with k non-zero 
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elements fc-sparse and we define 



relative sparsity: 



K = k/N. 



(2.3) 



Moreover, we call a tuple (x or ig, A, b = Ax OT i g ) a problem instance and say that it is 
recoverable if the solution a;* to LI is identical to x rig- 

There has been much work on establishing results stating for which matrices A 
problem LI is capable of recovering the sparsest solution, cf. j5] and references therein. 
We give an example of such a theorem based on the mutual coherence [ia of a matrix A: 
For a full-rank A € M. MxN with M < N, if a fc-sparsc solution x to Ax — b satisfies 



(where ae is the £th column of A), then it is the unique solution to LI; a smaller /ia 
leads to a larger bound on the sparsity of a signal that is guaranteed to be recovered. 
Many theoretical recovery results exist, most notably relying on the so-called "spark" 
[TT] and restricted isometry property [7] of a matrix. While some results are deter- 
ministic, many of them are probabilistic in the sense that if the elements of A are 
selected at random from certain probability distributions, then with "overwhelming 
probability" LI will recover the original signal [6]. 

It is generally accepted that these theoretical results are of limited practical 
use [13], since the requirements are generally NP-hard to check [24], and/or they 
provide very pessimistic bounds on the sparsity of signals that can be recovered. 
Better results are available for certain special matrices, such as those with elements 
drawn from a Gaussian distribution, but those results do not carry over to the system 
matrices encountered in CT. Instead, recoverability can be studied empirically Such 
studies have been conducted for certain practical systems, (see, e.g., [U [23]), but 
we are unaware of any systematic recoverability studies specifically for CT system 
matrices. 

Our empirical study is inspired by the work by Donoho and Tanner (DT) [9] who 
studied empirical recoverability using a phase diagram of the (5, p)-plane, where: 

undersampling fraction: S = M/N, sparsity fraction: p = k/M. (2.4) 

For certain classes of random matrices DT were able to prove existence of a sharp 
transition from non-recovery to recovery, and verify the result in empirical studies. 
Although we do not derive similar theoretical results for CT matrices, we can still 
conduct similar empirical recoverability studies using the DT phase diagram and the 
related relative sparsity-sampling (RSS) diagram, which we introduce in Section 2] 

We note that it is possible to construct examples of A:-sparse vectors for small k 
that cannot be recovered from CT measurements |12l 125] , implying that we cannot 
hope to obtain useful results on guaranteed recovery of all fc-sparse images. However, 
these constructed vectors might be pathological and very different from actual im- 
ages occurring in practice, and for this reason we are more interested in determining 
average-case recovery results for specific classes of images. 

We will empirically establish a quantitative relation between the number of mea- 
surements and the sparsity of a; or i g sufficient for recoverability. In order to do that, 
we conduct randomized simulations where we generate ensembles of images of vary- 
ing sparsity and CT system matrices corresponding to varying number of projections; 
then we use LI for reconstruction, and we check for recovery. Since different phantoms 
of same class and sparsity might require different number of views to be recovered, 
we are interested in the average-case recovery over the phantom ensembles. 
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3. Experiment design. In this section we describe the test problems used in 
our studies, as well as our approach to solving the reconstruction problem numerically. 

3.1. CT imaging geometry. Different scanning geometries are used depending 
on the manufacturer and scanner type. When designing a reconstruction algorithm 
using a given set of measured projections, one can adjust the number of pixels to 
trade-off resolution for signal-to-noise ratio, re-bin or interpolate the data to obtain 
additional "data" , use other basis functions than pixels, etc. 

We consider a generic geometry to serve as an example: the standard 2D fan- 
beam geometry with equi- angular views as illustrated in Figure l3~Tl Due to rotational 
symmetry we restrict the image to be within a circular mask inside an N s id c x N s i<i e 
square domain; the number of pixels in the masked image is approximately N = 
|~7r/4 • A r s 2 ido ]. The source-to-detector distance is 2N s id c , and the fan-angle 28.07° is 
set to precisely illuminate the circular mask. The number of views (or projections) is 
denoted N v . The rotating detector is assumed to consist of = 2iV s jd G bins, so the 
total number of measurements is M = 2N s id e N v . Each measurement is modeled with 
a linear equation of the form 

N 

b i = Y^A ij x j , i = l,...,M, (3.1) 

3=1 

where Ay is the path length of the ith ray through the jth pixel (this is the so-called 
line intersection method of discrctizing the forward operator) . The system matrix A 
is then M x N, and it is computed by means of the MATLAB package AIR Tools [TT] . 




Fig. 3.1. The fan-beam geometry with views from full angular range. Pixels outside the disk 
(shown as black) are not considered part of the image. Left: disk diameter N s ^ c = 32 and N v = 3 
views. Right: disk diameter N 3 ^ e = 64 and N v = 5 views. To improve the visualization, the number 
of rays per view has been reduced compared to the choice M = 2A r si( j e used in the simulations. 

3.2. Sparse phantom classes and generation of instances. By a class of 
phantoms we mean a set of phantom images that are described by a set of specifi- 
cations, in such a way that we can generate random realizations from the class. We 
refer to such an image as a phantom instance from the class, and multiple phantom 
instances from the same class are said to form a phantom ensemble. 

3.2.1. The "spikes" and "signedspikes" classes. For the spikes class, given 
an image size N and a target relative sparsity k, wc generate a phantom instance 
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Fig. 3.2. Sparse phantom image instances of class spikes (1st row, black is 0, white is 1), 
and of class signedspikes (2nd row, black is —1, white is 1). Relative spar sity from left to right is 
K = 0.05, 0.10, 0.20, 0.40, 0.60, 0.80. 

x e M N as follows: Given a zero vector x, randomly select k = round(ftiV) pixel 
indices, and set each selected pixel value in x to a random number from a uniform 
distribution over [0, 1]. Figure [XH shows examples of spikes phantom instances for 
varying n. This class is deliberately designed to be as simple as possible and it does 
not mimic any particular application. It is used to study the generic case of having 
a sparse image, and thus provide basic understanding before pursuing any specific 
application with more realistically looking phantoms. 

The signedspikes class is essentially identical to the spikes class, except the 
nonzero pixel values are uniformly distributed over [—1,1]. In standard CT recon- 
struction the attenuation coefficient is always non-negative, but in certain applications 
a background attenuation is subtracted, thereby causing x to have both positive and 
negative values. As we will show in Section|4l the modification to allow negative pixel 
values leads to a significant change in the recoverability. 

3.2.2. The "p -power" class. As a phantom class that models a more realistic 
type of images, we choose a model of breast tissue. The general idea is to introduce 
some structure to the non-zero pixels by creating correlation between neighboring 
pixels. Our procedure is based on [26] followed by thresholding to obtain many pixels 
that are strictly equal to 0. The amount of structure is governed by a parameter p 
and we refer to this parameterized family of phantom classes by p -power: 

1. Create a N sidc x N a ^ e phase image P with values drawn from a Gaussian 
distribution with zero mean and unit standard deviation. 

2. Create an iV S id e x ^Vsidc amplitude image U with pixels 

U(i,j) = (u 2 + u^-p/ 2 , i,j = l,..., N sidc , 

where u e = -l+ (21 - 1)/N, 1 = 1,.. .,N side . 

3. For all pixels compute F(i,j) = U(i, j')e 27rlP ^'^. 

4. Compute the magnitude of the 2D inverse discrete Fourier transform of F. 

5. Restrict this square image to the disk-shaped mask. 

6. Determine the number of non-zeros k = round(reiV). 

7. Keep the k largest pixel values and set the rest to 0. 

Figure l3~3l shows examples of phantoms from classes 1.0-power and 2.0-power. We 
see that the 1.0-power phantoms have more structure than the spikes phantoms, 
and 2.0-power has even more structure. 




3.3. Robust software for solving the optimization problems. The robust- 
ness of a numerical solution for deciding recovery depends on the solution accuracy. 
False conclusions may result from incorrect or inaccurate numerical solutions. To ro- 
bustly solve the optimization problems LI and L2 we must therefore use a numerical 
method which gives a clear indication of whether a correct solution, within a given 
accuracy, has been computed. Our choice is the package MOSEK [5D], which uses a 
primal-dual interior-point method suitable for large-scale sparse problems. MOSEK 
is equipped with numerous sophisticated features to handle numerical difficulties, and 
it issues warnings when it fails to compute an accurate solution. In all problem in- 
stances considered in our simulation studies, MOSEK managed to return a certified 
accurate solution. 

Solving L2 with MOSEK is straightforward, since rewriting H^Hl = x T x allows us 
to cast L2 as a quadratic program, which is readily solved by MOSEK. To solve LI 
using MOSEK we cast it as a linear program (LP) by introducing a new variable vector 
to G M. bounding the magnitude of x's elements from above and below, — w < x < w, 
which implicitly makes w non-negative. Minimizing l T w for 1 G M. N , while respecting 
the original constraint A x — b as well as —w < x < w, is an LP which is reliably 
solved by MOSEK. 

3.4. Strong and weak LI recovery. While L2 has a unique solution because 
\\x\\2 is strictly convex, it is not necessarily the case for LI. In a specific imaging 
application, the system matrix and the image are given, and the arising problem 
instance might cause LI to not have a unique solution. To handle this issue, given 
a specific problem instance (A, x or i gl b = ^4x or i g ) we consider two different notions of 
recovery for the computed solution x* to LI: 
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• Strong recovery: x* is identical to the original: x* = x or ; g . 

• Weak recovery: x* has the same optimal value as £ or i g : ||a;*||i = H^origlli- 
To motivate the distinction between different types of recovery we consider the possible 
outcomes from solving LI: 

1. > ||x or jg||i. This is not possible, since the smaller objective of a; or i g 
would cause the robust optimization algorithm to return x or i g in favor of x* . 

2. |[x*||x < ll^origlli- Another image, x*, with smaller objective than x or i g exists, 
so there is no hope for recovery with LI. 

3. ||x*||i = [^'origlll- This is weak recovery, so we know that x OI i g is at least a 
minimizcr. We do not know whether the found minimizer x* is identical to 
£orig, because we do not know whether the Ll-solution is unique. If x OT i g is 
indeed the unique Ll-solution, then x or ig is sometimes called identifiable |12j . 
It is possible to test for Ll-identifiability by verifying a set of necessary and 
sufficient conditions [15], but for general J{x) we are not aware of any general 
test, and we leave uniqueness tests for future work. Given weak recovery we 
test for strong recovery. The possible outcomes are: 

• x* = x or ig. With strong recovery, the original is recovered. This does 
not imply ident inability, so we can not be certain that another choice of 
optimization algorithm would also yield recovery. 

• x* x or i g . Without strong recovery, a different image with same ob- 
jective as the original was found, so the LI solution can not be unique. 
This, in itself, is valuable information. Additionally, it gives two strate- 
gies for potentially recovering the original. First, a different optimization 
algorithm may be able to recover the image (for instance, an interior- 
point method tends to select solutions in the interior of faces on the 
LP simplex, while a simplex- method selects vertices of the LP simplex.) 
Second, it may be possible to restrict the feasible set by introduction of 
constraints, such as nonnegativity, reflecting any prior knowledge of the 
image, and thereby recover the original. 

We want to emphasize here that the decision of strong recovery is dependent on the 
choice of optimization algorithm and its internal parameters (such as the initial point 
and stopping rules). In contrast, weak recovery is independent of the algorithm, as 
long as the optimization algorithm is robust. In our simulations, we will therefore — in 
addition to strong recovery results — also report weak recovery results. 

3.5. Simulations. Given the imaging model, the procedure for generating sparse 
phantom classes, and a robust optimization algorithm, we are in the position to carry 
out randomized simulation studies of recoverability within a phantom class, as a func- 
tion of relative sparsity and number of views. The design of a single basic simulation 
consists of the following steps: 1) Generate an instance (A, x or i g , b — Ax OI [ g ), 2) solve 
LI numerically to obtain x*, and 3) test for strong recovery (x* = £ rig) and weak 
recovery (||x*||i = ||x rig||i) using 

u**-*o rig || 2<e and ^iu_wM <£; (32) 



Ik 



orig||2 



where the threshold e is chosen based on the accuracy of the optimization algorithm. 
We used e = 10 -4 , and this choice is studied in Section l4~4l 



4. Simulation results. We start by introducing some notation that is useful 
for the following discussion. For a given problem instance, the sufficient view number 
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N* denotes the smallest number of views that gives a full-rank matrix A and thus 
recovery with L2. The LI recovery view number N^ 1 denotes the smallest number 
of views for which strong recovery is observed for all N v > N^ 1 . Using N* uf as a 
reference point for full sampling, we define two useful quantities: 

relative sampling: fx =N V /N° uf , (4.1) 
relative sampling for recovery: fi 1 " 1 = N^ 1 / N^ ui , (4-2) 

cf. the relative sparsity k defined in (12.31) . 

One of our main contributions is the so-called relative sparsity-sampling (RSS) 
diagram, introduced in Scction [4.21 in which the recovery percentage over an ensemble 
of phantoms is plotted as function of the relative sparsity and the relative sampling. 
Strong and weak recovery RSS diagrams reveal a sharp transition from non-recovery 
to recovery and a monotonically increasing relation between relative sparsity and 
relative sampling for recovery. The subsequent sections study how the RSS diagrams 
change with variations to the recovery criteria, image size, phantom class and diagram 
resolution; a summary of results is given in Section 14.81 




Fig. 4.1. Reconstructions of the spikes phantom with AT s i^ e = 64, relative sparsity k = 0.20. 
1st row: L2 reconstructions (black is 0, white is 1 ). 2nd row: L2 minus original image (black is 
—0.1, white is 0.1). 3rd row: LI reconstructions (black is 0, white is 1 ). J^th row: LI minus original 
image (black is —0.1, white is 0.1). Columns: 4, 8, 10, 12, 24 and 26 views. 

4.1. Recovery from undersampled data. We first establish that LI is ca- 
pable of recovering an image from undersampled CT measurements, and we compare 
with the L2 reconstruction. We use a phantom x or i g from the spikes class with 
-/Vside = 64, leading to N = 3228 pixels inside the mask. The relative sparsity is 
set to k = 0.20, which yields 646 non-zeros. We consider reconstruction from data 
corresponding to 4, 6, 8, ... , 32 views; the smallest and largest system matrices are of 
sizes 512 x 3228 and 4096 x 3228, respectively. Selected reconstructed images from 
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Fig. 4.2. Strong (top row) and weak (bottom row) recovery criterion measures from J 3 - 2 1) vs. 
view numbers for LI and L2 reconstruction of spikes phantoms with relative sparsity values n = 0.2, 
0.4 and 0.6. 



solving LI and L2 are shown in Figure fPl along with the difference image w.r.t. x or i S 
to better visualize the transition to recovery. 

Recall that the minimum- norm L2 solution is typically non-sparse [T3]. Therefore, 
we expect to need a full-rank system matrix for L2 to recover the original sparse 
image, and our results confirm this: the L2 reconstructions gradually improve with 
more views but recovery is not observed until A v = iV^ uf = 26. At A v = 24, the size 
is 3072 x 3228 and rank(A) = 3052, and the minimum-norm L2 solution is not the 
original. At 7V V = 26, the matrix is 3328 x 3228 and full-rank, so a; or i g is recovered. 

For LI, recovery occurs already at 7V V = A^ 1 = 12, where A has size 1536 x 3228 
and rank 1524. Evidently, in spite of the large null space of A, in this case, LI selects 
the original image. When A^ v increases, LI continues to recover the original up to 
and beyond N v = = 26, where the matrix becomes full-rank. Figure |4~T1 thus 

demonstrates the potential of LI recovery from undersampled CT measurements. 

In order to investigate quantitatively a possible relation between image sparsity 
and sufficient sampling for LI recovery we repeat the k = 0.2 experiment for k — 0.4 
and 0.6. Figure l4~2l shows the strong and weak recovery criterion measures from (|3.2|) 
for LI and L2 as a function of N v . For L2 the behavior is independent of the relative 
sparsity, as expected. For LI, on the other hand, A^ 1 takes the values 12, 16 and 
20, indicating a very simple relation between sparsity and LI recovery view number. 
For the problem instances considered here strong and weak recovery coincide but, as 
we will see below, that is not always the case. 

4.2. Recoverability studies using the DT diagram. In general, we can not 
expect all phantom instances of same relative sparsity to have the same A^ 1 , and 
in fact we observe some variation. One way to study this variation is through the 
DT phase diagram described in Section l2~2l For the spikes class with A S jd e = 64 we 
consider reconstruction with undersampling fraction S = M/N for M = A v Ab, where 
A^b = 2A r s ido and N v = 2, 4, 6, ... , 32. At each A v we consider sparsity fractions p = 
k/M = 1/16, 2/16, . . . , 16/16. We test for strong and weak recovery for reconstruction 
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Strong recovery, spikes, e=1,0e-04 Weak recovery, spikes, e=1 ,0e-04 

N .. =64, repts=100, N=26 N .. =64, repts=100, N s =26 

side r v side r v 




using the same system matrix A for 100 phantom instances at each (5, p). 

The resulting diagrams with the percentage of instances recovered at each (5, p) 
are shown in Figure 14.31 The strong recovery DT phase diagram shows that at very 
low sampling (A v = 2 and 4, i.e., 5 = 0.08 and 0.16) no phantoms can be recovered. 
An important observation is the very sharp phase transition from non-recovery to 
recovery, meaning that the variation of A^ 1 within the same phantom class is very 
limited. To the best of our knowledge, a similar analysis has not been done for CT- 
matrices before, and we therefore believe the sharp transition to be a new observation. 
Interestingly, in the weak recovery DT phase diagram, the transition extends all the 
way to the low sampling cases; showing potential for recovery from very few samples, 
provided one can find a way to restrict the solution to be unique. 

Normalizing the sparsity by the number of measurements causes a small problem 
for the diagram. When S > 1, values of p close to 1 can lead to k > N, for which it is 
impossible to construct instances. Those cases are shown with x -symbols in the DT 
diagrams. 

4.3. The relative sparsity-sampling (RSS) diagram. As an alternative to 
the DT phase diagram we can visualize the percentage of instances recovered in the 
(k, /x)-plane for the relative sparsity k = k/N and the relative sampling p = A v /A™ f . 
We refer to such a diagram as a relative sparsity-sampling (RSS) diagram. Figure l4~4l 
shows RSS diagrams for strong and weak recovery, corresponding to the DT phase 
diagrams in Figure FOl created by reconstructing 100 spikes phantoms for k = 0.025, 
0.05, 0.1, 0.2, . . . , 0.9 and 2V V = 2,4, 6, . . . , 32. 

Again, the variation of A^ 1 over phantom instances is almost negligible, since 
there is a sharp transition from non-recovery to recovery. The relative sampling for 
recovery /i L1 increases monotonically with the relative sparsity k, although not in a 
linear way. As k — > 0, the relative sampling for recovery /Lt L1 also approaches 0, and 
similarly /x L1 — > 1 for k — > 1, confirming that when the image is no longer sparse, LI 
gives no advantage over L2. 

The RSS diagram also gives quantitative information on the recovery view number 
for LI. Assume, for example, that we are given an image of relative sparsity k = 0.1, 
how many views would suffice for recovery? The RSS diagram shows that at k = 0.1, 
we have /U L1 = 0.31, which corresponds to A^ 1 = 8 views. If the phantom has 
k = 0.6, we obtain ^i L1 = 0.77 and A^ 1 = 20 views. Note that the RSS-diagram 
works equally well for answering the opposite question, namely, what is the maximal 



12 



J. H. J0RGENSEN, E. Y. SIDKY, P. C. HANSEN, AND X. PAN 



Strong recovery, spikes, e=1,0e-04 Weak recovery, spikes, e=1JDe-04 Average \i , spikes, e=1,0e-04 

N^ de =64, repts=100, N= uf =26 N side =64 ' re P ts = 10 °. K = 26 N = 64 ' repts=100, N= u, =26 




Relative sparsity: k = k / N Relative sparsity: k = k / N Relative sparsity: k = k / N 



Fig. 4.4. Left: Relative sparsity- sampling (RSS) diagram for strong recovery of the spikes class 
at iVside = 64; the colorscale is as in Figure \4~3\ Middle: Corresponding weak recovery RSS diagram. 
Right: Average relative sampling for strong and weak recovery over the phantom instances. 

relative sparsity that will allow recovery from 20 views? 

To a great extent, the strong and weak recovery RSS diagrams in Figure l4~4l look 
similar, but small differences exist, e.g., at k = 0.2. Recall that strong recovery can 
depend on the optimization algorithm used. To ensure strong recovery with MOSEK, 
we need p = 0.46 to obtain 100% recovery, while only 2% are strongly recovered at 
p = 0.38. For weak recovery, on the other hand, p = 0.38 suffices to recover 19% of the 
instances. The difference between the two types of recovery is even more pronounced 
at k — 0.9, where only 13% have strong recovery at p = 0.92, while 100% are weakly 
recovered. 

The similarities and differences between the RSS diagrams are emphasized in 
the rightmost plot in Figure 14.41 where the average ft 1,1 over all instances at each 
k is plotted. Since strong recovery of an instance implies weak recovery, the strong 
recovery curve will always be bounded from below by the weak recovery curve. 

Clearly, the RSS diagram introduced here is strongly inspired by the DT phase 
diagram, but for several reasons we find the RSS diagram more intuitive to interpret 
for our CT applications: 

1. A definition of the undersampling fraction relative to N, as used in the DT 
phase diagram, only makes sense when M = N yields a full-rank matrix. 
This is not necessarily the case in CT, so a slightly different measure of 
undersampling, ft = N v /N* ut , is required. (For the CT geometry studied 
here, S ft because the iV v that yields full rank is close to M = TV; but to 
be precise we still make the distinction.) 

2. The sparsity fraction p for the DT phase diagram is relative to the number 
of measurements M, whereas our sparsity k is relative to the number of 
pixels N . In the DT phase diagram, our relative sparsity is constant along 
hyperbolic curves instead of straight lines; to see this, note that a constant 
k = c is equivalent to pS = c, which is a hyperbola in the DT phase diagram. 
This means that the two diagrams are essentially different ways of visualizing 
the same data, only for slightly different ranges of the sparsity parameter. 
We find the RSS diagram more intuitive to use for our purposes, because 
the quantities of interest — the relative sparsity and sampling — can be read 
directly on the coordinate axes. 
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3. The DT phase diagram is typically used for studying randomly generated 
matrix instances A in addition to image instances x or i g , with the objective 
to understand recoverability over a whole matrix ensemble. For CT, on the 
other hand, we are interested in recoverability with a fixed matrix reflecting 
the given data acquisition geometry of the scanner in question, and the RSS 
diagram provides a mean for studying this situation: It attempts to answer 
how many views is sufficient for recovery of a phantom with a given (relative) 
sparsity for using a fixed choice of system matrix. 

For these reasons, the remaining part of the article will solely use the RSS diagram 

to visualize the recovery results. 




0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 




0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 0.2 0.4 0.6 0.8 1 



Fig. 4.5. Strong (top) and weak (bottom) recovery RSS diagrams for class spikes, 7V s id e = 64, 
for e = 1CT 2 , 1CT 3 , 1CT 4 , lCr 5 , ltr 6 (left to right). Axes labels and titles are omitted for improved 
visualization. Abscissas: relative sparsity k. Ordinates: relative sampling fj,. 100 phantoms. 

4.4. Dependence on the threshold. The decision of recovery in the face of 
numerical computations (and rounding errors) depends on how close the computed 
solution is to the original, as measured by the threshold e. As shown in Figure U2J the 
transition from non-recovery to recovery is often very distinct, and often recognized 
by a large jump in error norm (of five or more orders of magnitude) — although in 
some cases the transition is more gradual, cf. the lower left-most plot. 

Figure shows the strong and weak recovery RSS diagrams for a varying thresh- 
old e = 1(T 2 , 1(T 3 , 1(T 4 , 1CT 5 , 10" 6 . Very little difference is seen among the three 
middle values; there are only minor variations on the location of the transition curve, 
due to few instances located on the border line. The threshold e = 10~ 2 is too large, 
since the weak RSS diagram shows recovery for a too large number of instances. For 
the smallest threshold e = 10 -6 , the transition curve remains similar to those in the 
mid-range; but certain instances well above the transition curve are not recovered. 
While this could indicate a real phenomenon, we are very confident that it arises when 
we approach the obtainable numerical accuracy of the optimization algorithm. We 
conclude that the appearance of the RSS diagram is not very sensitive to the choice 
of e, and e = 10~ 4 is a good choice for the chosen algorithm. 

4.5. Dependence on image size. To study how the RSS diagram depends on 
image size, we construct additional diagrams for TVgidc = 32 and 128. For iV s id = 32 
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Fig. 4.6. RSS diagram dependence with image size. Top: N a i^ e = 32; bottom: N s idc = 128; left: 
strong recovery; middle: weak recovery; right: average relative sampling for strong/weak recovery. 



we can use the same relative sampling as for iV s id = 64 by taking N v — 1, 2, 16, 
since the matrix becomes full-rank at N* ui = 13. For iV s idc = 128 we have N^ ul = 51, 
so by taking N v = 4, 8, . . . , 64 we obtain approximately the same relative sampling. 

The two additional RSS diagrams are shown in Figure 14.61 Overall, we see the 
same monotone increase in ft 1,1 with increasing k. For N^c = 32, however, the 
transition from non-recovery to recovery is slightly more gradual, and the cases with 
the smallest k have a larger // L1 . 

The following observation illustrates the need to distinguish between strong and 
weak recovery RSS diagrams. For k = 0.025 and 0.05 the value ft, = 0.23 is sufficient 
for strong recovery, but adding one more view to obtain ft = 0.31 destroys recovery. 
This seemingly counter-intuitive phenomenon is explained by the geometry underlying 
the data acquisition: Going from 3 to 4 views is not done by including an additional 
view to the existing views; rather the 4 views are distributed equi-angularly around 
the image, and hence the two system matrices are different. For this problem, 3 
views provide enough data to recover the image, whereas the 4 views happen to be 
unfortunate. In the weak recovery RSS diagram, however, we see that both 3 and 4 
views yield reliable weak recovery. We conclude that at 4 views the original image is 
indeed a solution, but it is non-unique and MOSEK happened to compute another 
solution than the one used to generate the data. 

The RSS diagram for A^dc = 128 is similar to the -/V s jd e = 64 case, except for 
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slightly better recovery at the extreme k- values. This happens precisely where the 
strong and weak recovery RSS diagrams differ for N a ^ e — 64, while at N B ^ e = 128 
we do not see this difference and accordingly the values of /i sufficient on average for 
strong and weak recovery agree. When comparing the three average-case figures, the 
weak recovery curves are almost identical, and the strong recovery curves approach 
the weak when 7V S id e increases. The differences for iVgide = 32 are most likely caused 
by discretization effects. Since the gap between the two curves closes, we conclude 
that for larger images the problem of having non-unique solutions is less pronounced. 
Moreover, iV S id e = 64 is sufficiently large to give representative results that can be 
extrapolated to larger A^ide, especially when considering weak recovery. 



Strong recovery, 1 .0-power, e=1 .Oe-04 Weak recovery, 1 .0-power, e=1 .Oe-04 Average n , 1.0-power, £=1. Oe-04 
S =64, repts=100, N* u, =26 N si( je =64 ' re P ts = 10 °. N „ = 26 N =64, repts=100, N= u, =26 




Relative sparsity: k = k / N Relative sparsity: k = k / N Relative sparsity: k = k / N 



Fig. 4.7. B.SS diagrams for classes 1.0-power (top) and 2. 0-power (bottom). 

4.6. Dependence on the phantom class. As argued in Section l2~2l we cannot 
expect recovery of all fc-sparse images at a given relative sampling — at least only 
with very few non-zeros and unfavorably large relative sampling, due to the existence 
of pathological phantoms violating otherwise typical sufficient sampling. Hence, we 
study recoverability only for well-defined classes of phantom images. 

Figure |4~71 shows RSS diagrams for the 1.0-power and 2. 0-power classes for 
N s ide = 64. Comparing with the spikes RSS diagrams in Figure 14.41 we observe 
similar trends. For 1.0-power the transition from non-recovery to recovery occurs at 
the same («, /i)-values, but is slightly more gradual at a few K-values. For 2. 0-power 
the transition is even more gradual, and occurs at lower /i-values for the mid and 
upper range of k. We conclude that, on average, a smaller number of views suffices 
but the in-class recovery variability is larger. The spikes and p -power RSS diagrams 
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are similar, but there are also some differences and we conclude that sparsity docs not 
fully determine recoverability; the structure of the nonzero locations also plays a role. 
The RSS diagrams can be used to study variation with structure and to determine if 
two classes have similar recoverability. 

To further study how the recoverability depends on image class, we consider the 
RSS diagrams for the signedspikes class in Figure 14.81 Here, the transition is very 
sharp and the difference between strong and weak recovery is negligible. However, 
the transition occurs at much larger /i-values than for the spikes class in Figure 
14.41 For example, at k = 0.4 spikes has average ^ L1 = 0.62 compared to 0.77 for 
signedspikes. At k = 0.8 spikes still has undersampled recovery, although only 
at average ^t L1 = 0.92, compared to no undersampling admitted for signedspikes. 
We conclude that signedspikes images are harder to recover, and the RSS diagram 
allows us to quantify how much harder, which we consider a useful property. 



Strong recovery, signedspikes, e=1 .0e-04 Weak recovery, signedspikes, e=1.0e-04 
N =64, repts=100, N Suf =26 N sid e= 64 ' re P ts = 100 ' N v = 26 



Average , signedspikes, e=1.0e-04 
N=64, repts=100, N su, =26 
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Fig. 4.8. RSS diagrams for the class signedspikes. 
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Fig. 4.9. High-resolution RSS diagrams for the spikes class at N s ^ 



64. 



4.7. Dependence on diagram resolution. The RSS diagram can, among 
other things, be used to predict the recovery view number for images of a given 
relative sparsity. Assume the task is to recover a spikes image of expected relative 
sparsity 0.35; how many views should be used? The RSS diagrams in Figure l4~4l do 
not explicitly reveal the answer since k = 0.35 is not included, but linear interpolation 
of the average ^i L1 = 0.54 at n = 0.3 and 0.62 at k = 0.4 predicts ^i L1 = 0.58. 
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We can verify this by increasing the resolution of the RSS diagram. Figure 14.91 
shows diagrams for k = 0.025, 0.05, 0.10, 0.15, . . . , 0.90, 0.95 and N v = 1, 2, 3, . . . , 32. 
Overall the trend agrees well with the lower-resolution RSS diagrams, but some finer 
details become apparent. Most notably, at k — 0.025 we recognize the non-unique 
solution at 4 views that we saw for N^c = 32 in Figure 14.61 (it seems that there is 
some intrinsic property of the 4- view geometry that causes the solution to be non- 
unique). Apart from that, we still observe a sharp transition from non-recovery to 
recovery throughout the K-range. For k = 0.35, we find the average relative sampling 
for recovery to be // L1 = 0.58, thereby verifying our result interpolated from the 

lower-resolution RSS diagram. 

In the interest of keeping the computational time required for determining the RSS 
diagrams reasonably low, we find — based on examples such as the given at k = 0.35 — 
that the RSS diagram in Figure l4~4l is sufficiently well- resolved to locate the transition 
from non-recovery to recovery. If more computing resources are available, instead of 
increasing the diagram resolution it may be more advantageous to consider a larger 
ensemble of phantoms, to increase the reliability of the diagram. 

4.8. Summary of results. We used simulations to demonstrate the use of the 

RSS diagram for empirical quantification of the relation between image sparsity and 
the number of views sufficient for Ll-recovery from CT data. For the spikes class 
we observed a sharp transition from non-recovery to recovery throughout the relative 
sparsity range and, as function of relative sparsity, a monotonically increasing relative 
sampling for recovery. We saw that the RSS diagram is not sensitive to the threshold 
used in the recovery criterion, and that the transition from non-recovery to recovery is 
independent of the image size. We found similarities between RSS diagrams for differ- 
ent phantom classes, and the transition becomes smoother as the phantom structure 
increases. In summary, we believe that RSS diagrams are useful for understanding 
sparsity-exploiting reconstruction because they are: 

• a structured framework for establishment and quantification of a relation 
between sparsity and sufficient sampling for a particular system, 

• independent of any theoretical results guaranteeing solution uniqueness, and 

• a way to study large systems through extrapolation from smaller systems. 
In addition, the RSS diagram can easily be extended to studies of sparsity-exploiting 
methods with sparsity in other representations, with other penalty functions, relax- 
ation of the equality constraint to admit noisy /inconsistent data, etc. 

5. Discussion. Our CT simulation studies show that for a range of different 
sparse phantom classes it is possible to observe a sharp transition from non-recovery 
to recovery, both in the strong and weak sense. In light of the pessimistic theoretical 
recovery results mentioned in Section |2.2[ it is perhaps surprising that such a sharp 
transition exists. 

The simulation results speak only of the average-case recoverability, so there 
might still exist worst-case instances within our phantom classes that require far 
more samples than the average-case. Such pathological images are very unlikely to be 
realized in our simulations, so their influence can not be accounted for. On the other 
hand, such pathological phantoms might be very unrealistic from a practical perspec- 
tive and therefore one can argue that they should be omitted for better understanding 
the behavior of the method when applied to "typical" images. 

5.1. Limitations and extensions. The present ideal-case studies are far too 
unrealistic to provide any direct guidance on reconstruction quality and sampling in 
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a full-scale realistic CT application. Doing so is indeed the ultimate goal, but the 
problem has too many facets and too many parameters to analyze in enough detail 
to arrive at a complete understanding. Moreover, such results are necessarily very 
application-specific; for instance, the 2-norm metric (|3.2[) for image comparison is not 
appropriate for evaluating the practical utility of an image in a specific application. 

Our intention here is to take a first step in this direction by proposing to carry 
out studies of particular systems of interest, and to provide an analysis for a simple 
but easily gencralizablc set-up. Our quantitative conclusions are, of course, only valid 
for the specific geometries, algorithms, data, and phantom classes. 

The drawback of insisting on using a robust optimization algorithm is the lim- 
itation it enforces on the image size in the simulations; with MOSEK, we found 
-^Vsidc > 128 to be impractical. We do not advocate MOSEK for larger systems than 
the ones considered here; faster algorithms that are applicable to large-scale problems 
exist, and potentially we only pay a price of reduced robustness. Also, the possibility 
to extrapolate RSS diagrams to larger image sizes as observed in Section 1431 reduces 
the need for studies of larger systems. Since it appears that the relation between 
relative sparsity and relative sampling for recovery holds across the image size iV s id e , 
we do not need to study larger iV s id e but can simply extrapolate the relation to more 
realistic values of AT s id - 

5.2. Future work. The RSS diagram allows for generalization to increasingly 
more realistic set-ups. For example, other phantom classes can be considered with 
sparsity in other representations, such as in the gradient, and the penalty function can 
be changed to enforce the expected kind of sparsity. Noise and inconsistencies can be 
introduced in the data, the equality-constraint can be replaced by data misfit bounds 
controlled by rcgularization parameters, and the system can be changed (e.g., to a 
limited angle CT problem) . Making such generalizations might require modifications 
of the sparsity measure and the recovery criterion. 

Earlier studies of TV reconstructions [T5] seem to show a relation between gradient 
image sparsity and sufficient sampling, but due to the complexity of the test problems 
we found it difficult to understand and interpret the influence of each experiment. An 
investigation based on the RSS diagram could provide more structured insight. For 
instance, we might learn that TV reconstructions of a class of "blocky" phantoms 
exhibit a well-defined recovery-curve similar to the one in the present study. 

We always face the problem of possible non- unique solutions to LI, leading to 
RSS and DT phase diagrams that depend on the particular choice of optimization 
algorithm. To that end, we introduced the weak recovery RSS diagram which is 
independent of the optimization algorithm, and we found that in most cases the weak 
recovery curves provide tight lower bounds on the strong recovery curves. We are 
therefore fairly confident that the MOSEK results provide an accurate picture of 
the recoverability. It would be interesting to also determine a transition curve for 
the existence of a unique solution, since such a curve provides an upper bound the 
strong recovery curve. With a unique solution, the choice of optimization algorithm 
is not important as long as a robust optimization algorithm is used. We expect that 
Ll-uniqueness can be investigated by numerically verifying a set of necessary and 
sufficient conditions |15j . We did not pursue that idea in the present work in order 
to focus on an empirical approach easily gencralizablc to other penalties, such as TV, 
for which similar conditions are not available. 

Another interesting future direction is to study the in-class recovery variability, 
i.e., why the 2.0-power class transition from non-recovery to recovery is more grad- 
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ual. Would it be possible to identify differences between instances that were recovered 
and ones that were not. e.g., in the spatial location of the non-zero pixels, or in the 
histograms of pixel values? This could lead to subdividing the phantom class into par- 
titions each having sharper transitions occurring at different relative sampling values 
and thereby an even better understanding of what factors influence the recoverability. 

6. Conclusion. Inspired by the Donoho- Tanner phase diagram we devised a 
relative sparsity-sampling diagram for studying empirical recoverability in sparsity- 
exploiting X-ray CT image reconstruction. Our approach is not limited to sparsity in 
a specific domain or reconstruction by solving a specific optimization problem. We 
focused on sparsity in the pixel domain and reconstruction using the ^i-norm penalty. 

Our numerical simulations demonstrate a pronounced relation between image 
sparsity and the number of projections needed for recovery, for a range of image 
classes without and with structure and classes with signed and un-signed pixel values. 
In the majority of the studied cases, we found a sharp transition from non-recovery to 
recovery — a result that hitherto, to our knowledge, has not been established for CT. 
The sharp transition allows for quantitatively predicting the number of projections 
that, on average, suffices for accurate £i-recovery of phantom images from a specific 
class, or conversely, to determine the maximal sparsity of an image that can be re- 
covered for a certain number of views. With these initial results we have taken a step 
towards better quantitative understanding of the recoverability from undcrsampled 
measurements in X-ray CT, and additionally we provide a tool for determining similar 
answers for increasingly realistic systems. 
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